Warfarin sensitivity is associated with increased hospital mortality in critically ill patients

Background Warfarin is a widely used anticoagulant with a narrow therapeutic index and large interpatient variability in the therapeutic dose. Warfarin sensitivity has been reported to be associated with an increased incidence of international normalized ratio (INR) > 5. However, whether warfarin sensitivity is a risk factor for adverse outcomes in critically ill patients remains unknown. In the present study, we aimed to evaluate the utility of different machine learning algorithms for the prediction of warfarin sensitivity and to determine the impact of warfarin sensitivity on outcomes in critically ill patients. Methods Nine different machine learning algorithms for the prediction of warfarin sensitivity were tested in the International Warfarin Pharmacogenetic Consortium cohort and the Easton cohort. Furthermore, a total of 7,647 critically ill patients were analyzed by multivariable regression for the association between warfarin sensitivity and in-hospital mortality. Covariates that potentially confound the association were further adjusted for using propensity score matching or inverse probability of treatment weighting. Results We found that logistic regression (AUC = 0.879, 95% CI: 0.834–0.924) was indistinguishable from support vector machine with a linear kernel, neural network, AdaBoost and light gradient boosting trees, and significantly outperformed all the other machine learning algorithms. Furthermore, we found that warfarin sensitivity predicted by the logistic regression model was significantly associated with worse in-hospital mortality in critically ill patients, with an odds ratio (OR) of 1.33 (95% CI, 1.01–1.77). Conclusions Our data suggest that the logistic regression model is the best model for the prediction of warfarin sensitivity clinically and that warfarin sensitivity is likely to be a risk factor for adverse outcomes in critically ill patients.


Introduction
Warfarin is the most widely used oral anticoagulant worldwide. However, it has a narrow therapeutic window and large interpatient variability, and incorrect warfarin dosing is associated with an increased risk of bleeding or thromboembolism [1]. As a result, it is one of the drugs most commonly implicated in emergency department visits and an important cause of drug-related mortality [2,3]. Polymorphisms in cytochrome P450, family 2, subfamily C, polypeptide 9 (CYP2C9), and vitamin K epoxide reductase complex, subunit 1 (VKORC1) have been reported to correlate independently with the warfarin therapeutic dose [4][5][6]. Genetic variants in those two genes account for approximately 30% (20%-25% for VKORC1 rs9923231; 5%-10% for CYP2C9) of the interpatient warfarin dose variability [4][5][6][7]. Many pharmacogenetic algorithms have been developed to predict the individual warfarin dose by integrating clinical, demographic, and genetic variables [8][9][10][11]. Due to the strong genetic effects on warfarin dose, the U.S. Food and Drug Administration (FDA) revised the warfarin product label to give instructions on how to initiate an individualized dose based on combined genetic variants of CYP2C9 and VKORC1 [12]. Despite these effects, a classification of warfarin response in patients that reflects the genetic influence is needed. A recent study proposed a classification of warfarin sensitivity based on combined polymorphisms of CYP2C9 and VKORC1 and found that the average incidence of international normalized ratio (INR) > 5 in the combined sensitive and very sensitive group was nearly 2-fold that in the normal group, suggesting that warfarin sensitive patients are more prone to bleeding complications [13]. However, whether warfarin sensitivity is a risk factor for adverse outcomes in critically ill patients remains unknown.
Machine learning, which has been increasingly adopted in clinical practice, aims to devise models and algorithms that make predictions without being explicitly programmed. Data are fed to the machine learning algorithm, and the algorithm builds logic based on the data given. Machine learning models have been shown to identify trends and patterns readily and to handle multi-dimensional, heterogeneous data and non-linear relationships [14]. In our previous study, we developed a clinical algorithm to predict warfarin sensitivity based on logistic regression [13]. However, the performance of different machine learning algorithms for the prediction of warfarin sensitivity has yet to be determined for clinical applications.
In the present study, we aimed to evaluate the utility of different machine learning algorithms for the prediction of warfarin sensitivity and then to investigate the impact of warfarin sensitivity, as predicted by the best-performing model, on the outcomes of critically ill adult patients.

Study population
A total of 106 qualified patients with various cardiovascular diseases on warfarin therapy in the Easton cohort (S1 File) have been reported previously [13]. The study protocol was approved by the Copernicus Group Institutional Review Boards with a waiver of informed consent for retrospective analysis of de-identified data. The International Warfarin Pharmacogenetic Consortium (IWPC) cohort has been described previously [9,15]. The expanded dataset was downloaded from the PharmGKB website (http://www.pharmgkb.org/downloads/) and contains pooled data on 6922 chronic warfarin users recruited through the collaborative efforts of 22 research groups from 4 continents. This data set includes detailed, de-identified, curated data on demographic factors and clinical features, such as age, weight, height, and concomitant use of amiodarone, as well as CYP2C9 and VKORC1 genotypes. Missing values for height and weight were imputed with multivariate linear regression models. Specifically, weight, race, and sex were used for the imputation of the height variable, while height, race, and sex were used for the weight variable. Missing values of VKORC1 rs9923231 were imputed with a previously described strategy [9], which is based on linkage disequilibrium in VKORC1 and race. We excluded subjects without a stable warfarin dose, with missing age, or lacking CYP2C9 and VKORC1 rs9923231 genotypes after imputation. We also excluded 17 subjects with CYP2C9 *5, *6, *11, *13 and *14 alleles, due to low allele frequency, and one outlier subject with a stable warfarin dose of 315 mg/week. A total of 5444 subjects were included in this study.
The dataset for investigating the impact of warfarin sensitivity on the outcomes of critically ill adult patients was extracted from the Medical Information Mart for Intensive Care IV (MIMIC-IV) version 0.4. MIMIC-IV is a large and freely available database containing de-identified health-related data on patients who stayed in critical care units at the Beth Israel Deaconess Medical Center between 2008 and 2019 [16]. The database was approved by the Institutional Review Boards of the Massachusetts Institute of Technology. One author (ZM) was given permission to extract data from MIMIC-IV. Patients who used warfarin during hospitalization were eligible for inclusion. For patients with multiple ICU admissions, we included only the first ICU stay. We excluded patients who were younger than 18 years or whose ICU length of stay was less than 1 day. The primary outcome was 28-day mortality from the date of ICU admission. Missing values for height were imputed as in the IWPC cohort.

Warfarin sensitivity
Warfarin sensitivity has been defined in our previous study based on the FDA warfarin label (S1 Table) [12]. Briefly, VKORC1 G/G; CYP2C9 *1/*1, VKORC1 G/G; CYP2C9 *1/*2 and VKORC1 A/G; CYP2C9 *1/*1 were the three compound genotypes for warfarin normal responders. The remaining 15 compound genotypes were deemed warfarin sensitive, comprising the sensitive and very sensitive groups. In this study, warfarin sensitivity (normal or sensitive) was used as a categorical variable.
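The genotype-based definition can be illustrated with a small lookup. This is a minimal sketch, not the authors' code: it assumes that the 18 compound genotypes arise from three VKORC1 rs9923231 genotypes crossed with the six CYP2C9 genotypes formed by the *1, *2 and *3 alleles, which matches the count of 3 normal plus 15 sensitive genotypes given in the text. The function name and genotype strings are hypothetical.

```python
from itertools import product

# Compound genotypes (VKORC1 rs9923231, CYP2C9) named in the text as normal responders.
NORMAL_RESPONDERS = {
    ("G/G", "*1/*1"),
    ("G/G", "*1/*2"),
    ("A/G", "*1/*1"),
}

# Assumed genotype space: 3 VKORC1 x 6 CYP2C9 (*1, *2, *3 alleles only) = 18 combinations.
VKORC1_GENOTYPES = ("G/G", "A/G", "A/A")
CYP2C9_GENOTYPES = ("*1/*1", "*1/*2", "*1/*3", "*2/*2", "*2/*3", "*3/*3")


def warfarin_sensitivity(vkorc1: str, cyp2c9: str) -> str:
    """Return 'normal' or 'sensitive' for a compound genotype."""
    if vkorc1 not in VKORC1_GENOTYPES or cyp2c9 not in CYP2C9_GENOTYPES:
        raise ValueError(f"unsupported genotype: {vkorc1}; {cyp2c9}")
    return "normal" if (vkorc1, cyp2c9) in NORMAL_RESPONDERS else "sensitive"


# Enumerate every compound genotype and count those classified as sensitive.
all_genotypes = list(product(VKORC1_GENOTYPES, CYP2C9_GENOTYPES))
sensitive = [g for g in all_genotypes if warfarin_sensitivity(*g) == "sensitive"]
print(len(all_genotypes), len(sensitive))  # 18 15
```

The sketch collapses the sensitive and very sensitive groups into one label, as the study does when it treats warfarin sensitivity as a two-level categorical variable.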

The logistic model for predicting warfarin sensitivity
The logistic model to predict warfarin sensitivity followed our previously developed regression equation [13]: Probability (P) = 1- where exp is the exponential function; height is in cm; weight is in kg; age is in decades; Black = 1 if race is Black, otherwise 0; White = 1 if race is White, otherwise 0; Missing or mixed race = 1 if race is unspecified or mixed, otherwise 0; amiodarone = 1 if the patient is taking amiodarone, otherwise 0. If P > 0.4, the warfarin response is classified as sensitive. Our previous publication [13] showed that the accuracy, sensitivity, and specificity of the logistic regression are better with a probability threshold of > 0.4 for warfarin sensitivity. In the MIMIC-IV dataset, warfarin sensitivity was predicted using the above logistic model. The warfarin stable dose was estimated from the most frequent daily dose prescribed during hospitalization.
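Two steps of this procedure can be sketched in code: estimating the stable dose as the most frequent daily dose, and applying the P > 0.4 cutoff. The sketch below is illustrative only. It does not reproduce the regression equation, whose coefficients are given in [13], and the function names and the tie-breaking rule are assumptions rather than details from the study.

```python
from collections import Counter

SENSITIVITY_THRESHOLD = 0.4  # cutoff stated in the text: P > 0.4 means sensitive


def most_frequent_daily_dose(daily_doses_mg):
    """Return the most frequently prescribed daily dose (mg).

    Ties are broken by taking the lowest dose. This tie rule is an
    assumption; the text does not say how ties were handled.
    """
    if not daily_doses_mg:
        raise ValueError("no doses recorded")
    counts = Counter(daily_doses_mg)
    top = max(counts.values())
    return min(dose for dose, n in counts.items() if n == top)


def classify_response(probability):
    """Map a model probability to 'sensitive' (P > 0.4) or 'normal'."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "sensitive" if probability > SENSITIVITY_THRESHOLD else "normal"


# Hypothetical prescription history for one hospital stay (mg per day).
print(most_frequent_daily_dose([5, 5, 2.5, 5, 7.5]))  # 5
print(classify_response(0.41))  # sensitive
```

Because the comparison is strict, a probability of exactly 0.4 is classified as normal in this sketch.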

Statistical analysis
Implementation of machine learning algorithms and parameters. In the IWPC data set, we randomly chose 80% of the eligible patients (N = 4355) as the derivation cohort (training data set) to train the various classifiers. The remaining 20% of the patients (N = 1089) were reserved as the validation cohort (testing data set) to estimate the correct classification rates. The variables were initially identified based on a reported pharmacogenetic dosing algorithm [9] and included warfarin stable dose, height, weight, race, age, and use of amiodarone. For the cross-validation (CV), the IWPC data set was randomly partitioned into 5 equal parts (folds). In each iteration, a model was trained on four folds and then tested on the held-aside fold. The iteration was repeated 5 times so that each fold served once as the test data set to evaluate the model performance. For model quality assessment, the area under the receiver operating characteristic (ROC) curve (AUC), the overall prediction accuracy and the F1 score of the 5-fold CV were evaluated.
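The cross-validation loop and the three metrics can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' implementation: the helper names, the fixed random seed and the 0.5 default threshold are hypothetical, and `fit` and `predict_proba` stand in for whichever classifier is being evaluated.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN) for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(fit, predict_proba, X, y, k=5, threshold=0.5, seed=0):
    """Train on k-1 folds, test on the held-aside fold, repeat for every fold."""
    results = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict_proba(model, X[i]) for i in held_out]
        truth = [y[i] for i in held_out]
        preds = [int(s > threshold) for s in scores]
        results.append({
            "auc": auc(truth, scores),
            "accuracy": accuracy(truth, preds),
            "f1": f1_score(truth, preds),
        })
    return results


# With 5444 subjects, the five folds hold 1088 or 1089 subjects each.
print(sorted(len(fold) for fold in k_fold_indices(5444)))
```

Averaging the per-fold entries returned by `cross_validate` gives one summary value per metric for each classifier.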
Multivariable regression. To investigate the potential impact of warfarin sensitivity on the primary outcome, multivariable regression was applied. Clinically relevant confounders including age, height, weight, gender, race, service units, simplified acute physiology score (SAPS) II, sequential organ failure assessment (SOFA) score, interventions (mechanical ventilation, vasopressor use, sedative use), comorbidities (congestive heart failure (CHF), coronary artery disease (CAD), asthma, chronic obstructive pulmonary disease (COPD), endocarditis, atrial fibrillation, chronic renal disease, chronic liver disease, respiratory failure, acute respiratory distress syndrome (ARDS), pneumonia, stroke, and malignancy), clinical lab tests on the first day of ICU stay (hemoglobin, platelets, white blood cell count (WBC), bicarbonate, blood urea nitrogen (BUN), creatinine, chloride, sodium, potassium) and vital signs (mean blood pressure, respiratory rate, temperature, SpO2) were entered into a multivariate logistic regression model as covariates.
Propensity score matching (PSM) and inverse probability of treatment weighting (IPTW). To account for potential confounders associated with the predicted warfarin sensitivity and to ensure the robustness of our results, propensity score matching was performed using the variables described for the multivariable regression. Propensity scores for each patient were estimated by a multivariate logistic regression model. The matched cohort was created at a 1:1 ratio with a caliper of 0.05. Using the estimated propensity scores as weights, the IPTW method [19] was used to generate an additional weighted cohort (IPTW cohort). The balance of covariates between groups was evaluated by estimating standardized mean differences (SMD). An SMD < 0.1 was considered to indicate negligible imbalance between groups.
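The three propensity score steps can be sketched as follows. Several details here are assumptions, since the text does not specify them: matching is done by greedy nearest neighbor without replacement, the caliper is applied on the raw propensity score scale, the weights are the usual 1/ps and 1/(1 - ps) form, and the SMD uses the pooled standard deviation of a continuous covariate. "Treated" stands for the predicted warfarin sensitive group, and all function names are hypothetical.

```python
import math


def match_with_caliper(ps_treated, ps_control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs whose propensity scores
    differ by no more than the caliper. Unmatched subjects are dropped.
    """
    available = set(range(len(ps_control)))
    pairs = []
    for i, ps in enumerate(ps_treated):
        if not available:
            break
        # Closest remaining control; the index breaks exact ties deterministically.
        j = min(available, key=lambda c: (abs(ps_control[c] - ps), c))
        if abs(ps_control[j] - ps) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


def iptw_weights(treated_flags, propensity_scores):
    """Weight 1/ps for treated subjects and 1/(1 - ps) for controls."""
    return [1 / ps if t else 1 / (1 - ps)
            for t, ps in zip(treated_flags, propensity_scores)]


def standardized_mean_difference(x_treated, x_control):
    """Absolute SMD of a continuous covariate, using the pooled standard deviation."""
    def mean(v):
        return sum(v) / len(v)

    def variance(v):
        m = mean(v)
        return sum((a - m) ** 2 for a in v) / (len(v) - 1)

    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2)
    return abs(mean(x_treated) - mean(x_control)) / pooled_sd


# Hypothetical propensity scores: the second treated subject has no control within 0.05.
print(match_with_caliper([0.30, 0.80, 0.55], [0.32, 0.10, 0.57, 0.60]))
# [(0, 0), (2, 2)]
```

After matching or weighting, the SMD would be recomputed for each covariate and compared against the 0.1 threshold.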
Due to the skewed distribution of warfarin dose (with a longer tail at high doses), we transformed the raw dose into its square root. Continuous variables (warfarin stable dose, height, and weight) were compared between the derivation and validation cohorts with the Wilcoxon rank-sum test. For categorical variables, Fisher's exact test was used for CYP2C9 allele frequencies, and χ² tests were used for VKORC1 rs9923231 genotype, age, and race. All quantitative data are presented as means with 95% confidence intervals (CI) or medians with interquartile ranges. P values < 0.05 were considered statistically significant. All statistical analyses were conducted with R (version 3.6.1).

Basic characteristics of study population
The characteristics of the patients in the IWPC and Easton cohorts are shown in S2 Table. In the Easton cohort, 106 patients on long-term warfarin therapy for thromboembolic disorders and other cardiovascular diseases, with complete clinical and genotype data, were included in the analyses. The median warfarin stable dose was 27.5 mg/week. In the IWPC cohort, 5444 patients were included in the analyses, with a median warfarin stable dose of 28.0 mg/week.
In the MIMIC-IV cohort, 19,007 patients were prescribed warfarin during hospitalization, and 10,823 of them were admitted to the ICU (first stays only). After applying the exclusion criteria, a total of 7,647 critically ill patients were enrolled in the final cohort. Our logistic model predicted 3833 patients to have normal warfarin sensitivity and 3814 patients to be warfarin sensitive. The flow diagram of patient selection is shown in Fig 1. The baseline characteristics of the MIMIC-IV cohort are summarized in Table 1.

Performance of the different algorithms
Previously, we developed a clinical algorithm to predict warfarin sensitivity based on logistic regression [13]. To determine whether different machine learning algorithms perform better than logistic regression, the same features were used to train the various prediction classifiers on the derivation data set from the IWPC cohort. There was no statistically significant difference between the derivation and validation data sets (S2 Table). The maximal predictive performance evaluated with the validation data set was obtained with logistic regression (AUC = 0.868) and the support vector classifier with a linear kernel (SVC; AUC = 0.868), followed by the neural network (NN; AUC = 0.864), light gradient boosting trees (LGBT; AUC = 0.857), and AdaBoost (AB; AUC = 0.853) (Fig 2). The random forest (RF), extra trees (ET), Gaussian naive Bayes (GNB) and k-nearest neighbors (KNN) algorithms gave the least favorable results (Fig 2). To obtain robust results, the model performance was also evaluated by 5-fold CV (Table 2).

Feature importance
To investigate the importance of selected features in the prediction models, the relative feature importance was ranked by random forest and AdaBoost algorithms using the IWPC cohort. The warfarin stable dose was the most important variable to predict warfarin sensitivity (Fig 3).

Validation of the performance
To fully utilize the IWPC data set, we pooled the derivation and validation cohorts to train the nine different classifiers, and then validated the performance of the various machine learning algorithms using the external Easton cohort. As shown in Table 3, consistent with the results in the IWPC cohort, the logistic regression classifier (AUC = 0.835; accuracy = 0.802; F1 score = 0.759) was non-inferior to SVC, NN, AB and LGBT (P > 0.05), and better than all the other algorithms (P < 0.05).

Warfarin sensitivity and hospital mortality
To determine the impact of warfarin sensitivity on critically ill patients during hospitalization, 28-day mortality from ICU admission was designated as the primary outcome. Of the 3814 patients in the warfarin sensitive group, primary outcome events occurred in 158 (4.14%), compared with 117 of 3833 (3.05%) in the warfarin normal group. In the multivariate logistic regression analyses, after adjusting for age, height, weight, gender, race, service unit, SAPS score, SOFA score, interventions, comorbidities, clinical lab tests and vital signs on admission to the ICU, warfarin sensitivity was significantly associated with more primary outcome events (OR, 1.33; 95% CI, 1.01-1.77; P = 0.045) compared with normal warfarin sensitivity. To account for confounding that could bias the association between warfarin sensitivity and the primary outcome, we further performed PSM and propensity score-based inverse probability of treatment weighting (IPTW) analyses. In the PSM matched cohort, after adjusting for the above variables, there was an increased risk of primary outcome events in the warfarin sensitive group (adjusted OR, 1.49; 95% CI, 1.09-2.06; P = 0.014) (Fig 4). Similarly, in the IPTW-weighted cohort, the adjusted OR for the primary outcome was 1.38 (95% CI, 1.02-1.87; P = 0.038).
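As a sanity check on the raw event counts, the unadjusted odds ratio can be computed directly. The sketch below is not part of the study's analysis: it uses the Woolf (log-based) confidence interval, and the resulting crude OR of about 1.37 involves no covariate adjustment, so it is not expected to equal the adjusted estimates reported above. The function name is hypothetical.

```python
import math


def crude_odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted OR of group A versus group B with a Woolf (log-based) 95% CI."""
    a, b = events_a, total_a - events_a  # group A: events, non-events
    c, d = events_b, total_b - events_b  # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Event counts from the text: 158 of 3814 (sensitive) versus 117 of 3833 (normal).
or_, lo, hi = crude_odds_ratio(158, 3814, 117, 3833)
print(f"{158 / 3814:.2%} vs {117 / 3833:.2%}")  # 4.14% vs 3.05%
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # crude OR = 1.37 (95% CI 1.08-1.75)
```

The crude and adjusted estimates point in the same direction, which is what one would expect if the measured covariates do not fully explain the difference in mortality.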

Discussion
Direct oral anticoagulants (DOACs), also known as non-vitamin K antagonist oral anticoagulants (NOACs), are now available as a possible alternative to warfarin. They have a wide therapeutic window, which allows fixed dosing in adults without the need for laboratory monitoring or dose adjustments for body weight. However, warfarin has proven efficacy, low cost, and years of physician experience compared with DOACs [20]. Warfarin offers superior efficacy compared with DOACs in high-risk patients with antiphospholipid syndrome and mechanical valves. The clinical trial of rivaroxaban versus warfarin in patients with antiphospholipid syndrome was terminated prematurely and showed an increased rate of

PLOS ONE
events with rivaroxaban compared with warfarin [21]. The RE-ALIGN trial for the use of dabigatran versus warfarin in patients with mechanical heart valves was also terminated prematurely due to an excess of both thromboembolic events and bleeding events among patients in the dabigatran group [22]. In addition, warfarin may be a superior option for patients with a history of medication nonadherence or morbid obesity. Therefore, warfarin will remain an important and frequently used anticoagulant.
The individual response to warfarin is highly variable and is greatly influenced by genetic variants, for example, in CYP2C9 and VKORC1. In view of the importance of this genetic influence, we have proposed a classification for the individual response to warfarin and created a simple algorithm to predict warfarin sensitivity based on logistic regression [13]. Logistic regression is a widely used traditional statistical approach, but it is prone to sparse-data bias when the number of records is small, especially when training and testing procedures are applied. Recently, machine learning algorithms have been increasingly adopted in clinical use in general medicine [23,24]. In data analytics, machine learning aims to devise models and algorithms that lend themselves to prediction. Compared with linear methods such as multivariable linear regression, machine learning can handle complex, nonlinear relationships among variables and deal with many sources of inferential trouble such as outliers and collinearity. It is therefore necessary to compare various machine learning algorithms with logistic regression for the prediction of warfarin sensitivity. In this study, using distinct machine learning algorithms, we developed eight models with the same variables as in the logistic regression model to predict warfarin sensitivity. Intriguingly, we found that the model produced by logistic regression was indistinguishable from SVC with a linear kernel, NN, AdaBoost and LGBT and significantly outperformed all the other algorithms in both the IWPC cohort and the external Easton cohort. This is consistent with the IWPC study, in which linear regression outperformed other machine learning-based algorithms for the prediction of warfarin maintenance dose [9]. This finding has also been confirmed in Chinese patients, in whom the linear regression model performed best [25].
This is probably because machine learning algorithms excel at complex, non-linear problems with many independent variables, whereas our prediction task involved only a few. Because the logistic regression model uses few variables, is easy to implement in a clinical setting and performs as well as or better than the alternatives, it is the best model for the prediction of warfarin sensitivity clinically. However, the models in our study were relatively simple, with only six variables included for the prediction of warfarin sensitivity. Additional variables that potentially affect the prediction of warfarin sensitivity were not included in the models, such as comorbidities, additional drug-drug interactions, and patient behaviors, including diet, exercise, and compliance. With more variables integrated into the models, machine learning-based models are likely to perform better, at the risk of overfitting.
It has been shown that the predicted warfarin maintenance dose decreases with increasing age [26]. In the present study, we found that age was an important factor in the prediction of warfarin sensitivity. These data indicate that age is linked to both warfarin maintenance dose and sensitivity. In addition, we have shown that Asians, Whites, and Blacks have different polymorphism profiles of CYP2C9 and VKORC1 in the IWPC cohort [13]. In line with this, we found that race was another important variable for predicting warfarin sensitivity.
Lastly, we applied the logistic regression model to predict warfarin sensitivity in critically ill patients in the MIMIC-IV database. Intriguingly, predicted warfarin sensitivity was significantly associated with higher in-hospital mortality among critically ill patients, compared with normal warfarin sensitivity. This result suggests that warfarin sensitivity may be a risk factor in critically ill patients.
Our study has several limitations. First, missing VKORC1 genotypes in the IWPC cohort were derived based on linkage disequilibrium, and missing values for height and weight were imputed using multivariate linear regression. Although these imputation strategies are generally reliable, they could have introduced errors into the study. Second, although we attempted to balance and control for potential confounding factors through multivariable adjustment and propensity score matching, residual confounding is likely to remain because of the inherent nature of retrospective studies and could not be balanced in the analysis of the association between warfarin sensitivity and in-hospital mortality.

Conclusions
We evaluated the utility of different machine learning algorithms for the prediction of warfarin sensitivity and found that logistic regression was indistinguishable from other machine learning algorithms such as SVC with a linear kernel, NN, AdaBoost and LGBT. We found that the logistic regression model is the best model for the prediction of warfarin sensitivity clinically. We also demonstrated that warfarin sensitivity predicted by the logistic regression model was significantly associated with increased in-hospital mortality in critically ill patients, suggesting that warfarin sensitivity may be a risk factor for adverse outcomes in critically ill patients.
Supporting information S1